Speech emotion recognition with cross-lingual databases
Abstract
In this paper, we investigate cross-lingual automatic speech emotion recognition. The basic idea is that, since the emotion recognition system is based on acoustic features only, data in different languages can be combined to improve recognition accuracy. We begin with the construction of a Mandarin database of emotional speech, which is similar in composition and size to the well-known Berlin Database of Emotional Speech (EMO-DB). To reduce the variability due to different languages and different speakers, we propose applying histogram equalization as a data normalization method. Recognition systems based on support vector machines have been evaluated on EMO-DB. Compared to the baseline system without multi-lingual databases and data normalization, the proposed system raises the emotion recognition accuracy from 86.2% to 91.7%. This corresponds to a relative error reduction of 39.9%, since the error rate falls from 13.8% to 8.3%. The accuracy is among the best known results reported on EMO-DB, if not the best.
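To make the normalization step concrete, here is a minimal sketch of histogram equalization applied to acoustic feature vectors. The abstract does not say which reference distribution or grouping the authors use. This sketch assumes a standard normal reference and a rank-based empirical CDF, and the function names `histogram_equalize` and `equalize_features` are hypothetical.

```python
from statistics import NormalDist


def histogram_equalize(values):
    """Map one feature dimension onto a standard normal reference.

    Assumption: the reference distribution is N(0, 1); the abstract
    does not specify which reference the authors use.
    """
    n = len(values)
    reference = NormalDist(0.0, 1.0)
    # Sort indices by value so each sample can be given a rank.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # 1-based average rank for ties
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    # Empirical CDF value (rank - 0.5) / n lies strictly inside (0, 1),
    # so the inverse reference CDF is always defined.
    return [reference.inv_cdf((r - 0.5) / n) for r in ranks]


def equalize_features(rows):
    """Equalize every dimension of a list of feature vectors independently."""
    columns = list(zip(*rows))
    equalized = [histogram_equalize(list(col)) for col in columns]
    return [list(vec) for vec in zip(*equalized)]
```

Under these assumptions, one would call `equalize_features` separately on the feature vectors of each corpus (or each speaker), so that every group is mapped onto the same reference distribution before the data are pooled for classifier training. Because the mapping depends only on ranks, any monotonic difference in scale or offset between groups is removed.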
Similar resources
Analysis of Multi-Lingual Emotion Recognition Using Auditory Attention Features
In this paper, we build mono-lingual and cross-lingual emotion recognition systems and report performance on English and German databases. The emotion recognition system uses biologically inspired auditory attention features together with a neural network for learning the mapping between features and emotion classes. We first build mono-lingual systems for both Berlin Database of Emotional Spee...
A preliminary study of cross-lingual emotion recognition from speech: automatic classification versus human perception
The aim of this study is to investigate the effect of cross-lingual data on human perception and automatic classification of emotion from speech. We use four different databases from three languages (English, Chinese, and German) and two types (acted and improvised). For automatic classification, there is a significant degradation with the cross-corpus setup compared to the within-corpus setup. For human perceptio...
Approaching Multi-Lingual Emotion Recognition from Speech - On Language Dependency of Acoustic/Prosodic Features for Anger Detection
This paper reports on mono- and cross-lingual performance of different acoustic and/or prosodic features. We analyze the way to define an optimal set of features when building a multilingual emotion classification system, i.e. a system that can handle more than a single input language. Due to our findings that cross-lingual emotion recognition suffers from low recognition rates, we analyze our fea...
Cross-lingual and Multilingual Speech Emotion Recognition on English and French
Research on multilingual speech emotion recognition faces the problem that most available speech corpora differ from each other in important ways, such as annotation methods or interaction scenarios. These inconsistencies complicate building a multilingual system. We present results for cross-lingual and multilingual emotion recognition on English and French speech data with similar characterist...
Approaching Multi-Lingual Emotion Recognition from Speech - On Language Dependency of Acoustic/Prosodic Features for Anger Recognition
In this paper, we describe experiments on automatic emotion recognition using comparable speech corpora collected from real-life American English and German Interactive Voice Response systems. We compute the optimal set of acoustic and prosodic features for mono-, cross- and multi-lingual anger recognition, and analyze the differences. When an emotion recognition system is confronted with a langu...